Time manipulation technique for speeding up reinforcement learning in simulations
نویسندگان
چکیده
A technique for speeding up reinforcement learning algorithms by using time manipulation is proposed. It is applicable to failure-avoidance control problems running in a computer simulation. Turning the time of the simulation backwards on failure events is shown to speed up the learning by 260% and improve the state space exploration by 12% on the cart-pole balancing task, compared to the conventional Q-learning and Actor-Critic algorithms.
منابع مشابه
Time Hopping technique for faster reinforcement learning in simulations
A technique called Time Hopping is proposed for speeding up reinforcement learning algorithms. It is applicable to continuous optimization problems running in computer simulations. Making shortcuts in time by hopping between distant states combined with off-policy reinforcement learning allows the technique to maintain higher learning rate. Experiments on a simulated biped crawling robot confir...
متن کاملSpeeding up Tabular Reinforcement Learning Using State-Action Similarities
One of the most prominent approaches for speeding up reinforcement learning is injecting human prior knowledge into the learning agent. This paper proposes a novel method to speed up temporal difference learning by using state-action similarities. These handcoded similarities are tested in three well-studied domains of varying complexity, demonstrating our approach’s benefits.
متن کاملSpeeding Up Reinforcement Learning with Behavior Transfer
Reinforcement learning (RL) methods (Sutton & Barto 1998) have become popular machine learning techniques in recent years. RL has had some experimental successes and has been shown to exhibit some desirable properties in theory, but it has often been found very slow in practice. In this paper we introduce behavior transfer, a novel approach to speeding up traditional RL. We present experimental...
متن کاملTime Hopping Technique for Reinforcement Learning and its Application to Robot Control
To speed up the convergence of reinforcement learning (RL) algorithms by more efficient use of computer simulations, three algorithmic techniques are proposed: Time Manipulation, Time Hopping, and Eligibility Propagation. They are evaluated on various robot control tasks. The proposed Time Manipulation [1] is a concept of manipulating the time inside a simulation and using it as a tool to speed...
متن کاملLocking in Returns: Speeding Up Q-Learning by Scaling
One problem common to many reinforcement learning algorithms is their need for large amounts of training, resulting in a variety of methods for speeding up these algorithms. We propose a novel method that is remarkable both for its simplicity and its utility in speeding up Q-learning. It operates by scaling the values in the Q-table after limited, typically small, amounts of learning. Empirical...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/0903.4930 شماره
صفحات -
تاریخ انتشار 2008